Abstract



The present research derives simplified formulae for computing the standard 
error of the frequency estimation method for equating score distributions that are 
continuized using a uniform or Gaussian kernel function (Holland, King & Thayer, 
1989; Holland & Thayer, 1987). The simplified formulae are applicable to equating 
both the observed- and smoothed-score distributions (Rosenbaum & Thayer, 1987). 
Results from two empirical studies indicate that the simplified formulae work rea- 
sonably well for samples with moderate sizes, say one thousand examinees. 

Key words: equipercentile equating, frequency estimation method, kernel equating, 
log-linear models, standard errors. 
Introduction



Equipercentile equating defines score x on Form-X and score e(x) on
Form-Y as equivalent via the function:

(1)    $e(x) = G^{-1}[F(x)]$,



where F and G denote the distribution functions of the respective scores on
Forms X and Y in the reference population. Because observed scores are discrete
integers, the equipercentile equating function is not well-defined unless F and G
are continuized. Let i and k denote integer scores on Forms X and Y, respectively.
Conventionally, all repetitions of integer scores i and k are assumed to be
uniformly distributed in well-defined ranges, for instance, $i - 0.5 < x < i + 0.5$
and $k - 0.5 < y < k + 0.5$, where x and y denote continuous scores. Based on this
notion, an equipercentile equivalent can be defined as:



(2)    $e_u(x) = G_u^{-1}[F_u(x)] = k - 0.5 + \dfrac{F_u(x) - G(k-1)}{G(k) - G(k-1)}$,

(Lord, 1965), where k is an integer score such that $G(k-1) < F_u(x) < G(k)$; $F_u$
and $G_u$ are the continuized F and G, respectively, based on the uniform assumption.

Holland and Thayer (1989) introduced a kernel method of continuizing 
observed-score distributions which includes the uniform assumption as a special case 
[i.e., (2) can be obtained by using a uniform kernel]. Holland and Thayer (1989) 
also suggested using a Gaussian kernel in the continuization phase. Specifically, the
Gaussian kernel scheme is defined to be

(3)    $F_c(x) = \sum_i f(i)\,\Phi(w_{ix})$,

where $F_c(x)$ denotes the continuized F evaluated at x; $f(i)$, the discrete density of
the integer score i; and $\Phi(w_{ix})$, the standard normal cdf evaluated at $w_{ix}$, which
is a linear function of i and x with parameter
$A_x = [\sigma_x^2/(\sigma_x^2 + B_x^2)]^{1/2}$. In (3), $\mu_x$ and $\sigma_x^2$ are the
population mean and variance, respectively; the constant $B_x$ is a so-called
bandwidth for the Gaussian kernel function. Likewise, G can also be continuized
using the Gaussian kernel. By analogy to (2), the equating function based on the
Gaussian kernel function is defined to be:

(4)    $e_c(x) = G_c^{-1}[F_c(x)]$,

where $G_c$ denotes the continuized G. For simplicity, equipercentile equating based
on (2) and (4) will be referred to as the uniform and Gaussian kernel methods,
respectively.

In equating practice, the unknown F and G in (1) must be empirically
estimated from the samples before the continuization phase is performed. For
security or disclosure considerations, Forms X and Y are normally administered to
two naturally occurring groups along with a set of common items. The F and G
estimates are then adjusted for sample-selection bias using score information on the
common items. The frequency estimation (FE) method (Angoff, 1984) is a device
for estimating F and G under the common-item design, and its use has been
recommended in many equating studies (e.g., Braun & Holland, 1982; Holland, King
& Thayer, 1989). The standard error of the FE method has been derived by
Jarjoura and Kolen (1985) and Holland, King and Thayer (1989) for the uniform and
Gaussian kernel equating functions, respectively. Both formulae were derived on
the basis of first-order Taylor approximations to $e_u(x)$ and $e_c(x)$, respectively.
For small-sample equating, the bivariate distributions of scores on the main test
form and common items may be smoothed using log-linear models (Rosenbaum &
Thayer, 1987) to reduce sampling errors in equating results. However, computing the
standard error of the FE method for equating smoothed distributions becomes
tedious for practitioners. In this study, we propose simplified formulae for estimating
the standard errors of kernel equating methods; in the formulae, the complicated
derivatives resulting from the first-order Taylor approximations are bypassed via
their large-sample approximations. The simplified formulae are applicable to
equating observed- and smoothed-score distributions. In the next section, the
common-item equipercentile equating methods are briefly reviewed. The standard
error formulae for equating observed- and smoothed-score distributions are then
derived for the uniform and Gaussian kernel methods separately. Finally, the
accuracy of the proposed formulae is evaluated through two empirical studies.



The Common-Item Equipercentile Equating 



Let j be a score on the common items with distribution function H and density
h. For ease of discussion, it is assumed that common-item scores do not count
toward total test scores. According to the conditional homogeneity assumption, the
conditional distribution of i on X given j (likewise, the conditional distribution of
k on Y given j) is the same for the Form-X and Form-Y groups in the population. The
discrete densities of marginal i and k in the reference population can be estimated
from the samples by:

(5)    $\hat f(i) = \sum_j \hat f_x(i|j)\,\hat h(j)$, and

(6)    $\hat g(k) = \sum_j \hat g_y(k|j)\,\hat h(j)$,

where the subscripts x and y denote the sample estimates based on data from the
Form-X and Form-Y groups, respectively;
$\hat h(j) = \gamma \hat h_x(j) + (1 - \gamma)\hat h_y(j)$ with $\gamma = N_x/(N_x + N_y)$,
where $\hat h_x$ and $\hat h_y$ are the respective sample relative frequencies of the
common-item score j; $N_x$ and $N_y$ are the sample sizes.
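The FE estimates in (5) and (6) can be illustrated with a short sketch. The function name `frequency_estimation` and the count-table layout (rows indexed by total score, columns by common-item score) are assumptions made for illustration, not part of the original study. Columns with no examinees are simply set to zero here, which is an assumption of this sketch and a cruder device than the rolling weighted average procedure used later in the empirical studies.

```python
import numpy as np

def frequency_estimation(counts_x, counts_y):
    """Sketch of Eqs. (5)-(6): FE estimates of the marginal densities.

    counts_x[i, j]: number of Form-X examinees with total score i and
                    common-item score j (hypothetical layout).
    counts_y[k, j]: the same for the Form-Y group.
    Returns (f, g, h): estimated densities of i, k and pooled density of j.
    """
    counts_x = np.asarray(counts_x, dtype=float)
    counts_y = np.asarray(counts_y, dtype=float)
    n_x, n_y = counts_x.sum(), counts_y.sum()
    gamma = n_x / (n_x + n_y)

    # Relative frequencies of the common-item score j in each group.
    col_x = counts_x.sum(axis=0)
    col_y = counts_y.sum(axis=0)
    h = gamma * (col_x / n_x) + (1.0 - gamma) * (col_y / n_y)

    # Conditional densities f_x(i|j) and g_y(k|j); columns with zero
    # frequency are left at zero (an assumption of this sketch).
    f_cond = np.divide(counts_x, col_x, out=np.zeros_like(counts_x), where=col_x > 0)
    g_cond = np.divide(counts_y, col_y, out=np.zeros_like(counts_y), where=col_y > 0)

    # Eq. (5) and Eq. (6): weight the conditionals by the pooled h(j).
    return f_cond @ h, g_cond @ h, h
```

When every common-item score is observed in both groups, the returned densities each sum to one because the conditional columns and the pooled weights do.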

In the uniform kernel method, the marginal distributions of x and y are
estimated by:

(7)    $\hat F_u(x) = \sum_{i<i_0} \hat f(i) + \hat f(i_0)(x - i_0 + 0.5)$, and

(8)    $\hat G_u(y) = \sum_{k<k_0} \hat g(k) + \hat g(k_0)(y - k_0 + 0.5)$,

where $i$, $i_0$, $k$, and $k_0$ are integer scores; x and y are continuous scores in the
ranges $i_0 - 0.5 < x < i_0 + 0.5$ and $k_0 - 0.5 < y < k_0 + 0.5$, respectively. In the
Gaussian kernel method, on the other hand, the marginal distributions of x and y
are estimated by:

(9)    $\hat F_c(x) = \sum_i \hat f(i)\,\Phi(\hat w_{ix})$, where
       $\hat w_{ix} = \dfrac{x - A_x i - (1 - A_x)\hat\mu_x}{A_x B_x}$, and
       $A_x^2 = \dfrac{\hat\sigma_x^2}{\hat\sigma_x^2 + B_x^2}$;

(10)   $\hat G_c(y) = \sum_k \hat g(k)\,\Phi(\hat\psi_{ky})$, where
       $\hat\psi_{ky} = \dfrac{y - A_y k - (1 - A_y)\hat\mu_y}{A_y B_y}$, and
       $A_y^2 = \dfrac{\hat\sigma_y^2}{\hat\sigma_y^2 + B_y^2}$.

Note that $\hat\mu_x$, $\hat\sigma_x^2$, $\hat\mu_y$ and $\hat\sigma_y^2$ are sample means and
variances of Form-X and Form-Y scores in their respective groups. After score
distributions are continuized using (7) through (10), the equipercentile equivalents
can be found using the equating functions (2) and (4), respectively.
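The continuization step and the equating functions (2) and (4) can be sketched as follows. All function names are hypothetical, integer scores are assumed to run from 0 to len(dens) - 1, and the inverse is found by bisection rather than by the closed form in (2). The Gaussian version computes the mean and variance from the supplied density itself, which is a simplifying assumption of this sketch; the text above uses the sample moments of each group.

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def uniform_cdf(dens, x):
    """Eq. (7)/(8): uniform-kernel continuized cdf at continuous score x."""
    dens = np.asarray(dens, dtype=float)
    i0 = int(np.floor(x + 0.5))               # integer score whose interval holds x
    i0 = min(max(i0, 0), len(dens) - 1)
    return float(dens[:i0].sum() + dens[i0] * (x - i0 + 0.5))

def gaussian_cdf(dens, x, bandwidth):
    """Eq. (9)/(10): Gaussian-kernel continuized cdf at continuous score x."""
    dens = np.asarray(dens, dtype=float)
    scores = np.arange(len(dens))
    mu = float(np.sum(scores * dens))          # moments taken from dens (assumption)
    var = float(np.sum((scores - mu) ** 2 * dens))
    a = sqrt(var / (var + bandwidth ** 2))
    w = (x - a * scores - (1.0 - a) * mu) / (a * bandwidth)
    return float(np.sum(dens * np.array([norm_cdf(v) for v in w])))

def invert_cdf(cdf, p, lo, hi, iters=200):
    """Solve cdf(y) = p on [lo, hi] by bisection; cdf must be nondecreasing."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equate(f, g, x, cdf, lo, hi):
    """Equipercentile equivalent of x: G^{-1}[F(x)] for a given continuization."""
    p = cdf(f, x)
    return invert_cdf(lambda y: cdf(g, y), p, lo, hi)
```

For example, `equate(f, g, 12.0, uniform_cdf, -0.5, len(g) - 0.5)` gives the uniform-kernel equivalent, and passing `lambda d, v: gaussian_cdf(d, v, 1.0)` with a wider bracket gives the Gaussian-kernel equivalent with bandwidth 1.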

The observed distributions of discrete scores are often sparse when sample
sizes are small, and equipercentile equating using sparse distribution functions tends
to be unstable and inaccurate. Therefore, presmoothing of the sample bivariate
(i,j) and (k,j) tables using log-linear models has been recommended for
improving the accuracy of equipercentile equating (Holland & Thayer, 1987;
Rosenbaum & Thayer, 1987). Let $f(i,j)$ denote the joint density of the i and j scores on
Form-X and the common items, respectively. The log-linear model assumes that:

(11)   $\log f(i,j) = \beta_0 + \sum_{t=1}^{q} \beta_t\, i^{t} + \sum_{t=q+1}^{2q} \beta_t\, j^{t-q} + \beta_{2q+1}\,(ij)$,

where $\beta_0$ is a normalizing constant selected to make the sum of $f(i,j)$ equal one
(Rosenbaum & Thayer, 1987). The maximum likelihood estimates (MLEs) of the
$\beta$'s in (11) have the property that the first q fitted univariate moments and the fitted
correlation equal their corresponding moments and correlation observed in the (i,j)
sample (Holland & Thayer, 1987). By analogy, the (k,j) table for Form-Y and the
common items can also be smoothed using a model similar to (11). We denote
$\hat\beta_x$ and $\hat\beta_y$ as the vectors of parameter estimates for smoothing the (i,j) and
(k,j) tables, respectively. The FE method using the smoothed densities can be
expressed as:

(12)   $\tilde f(i) = \sum_j \tilde f_x(i|j)\,\tilde h(j)$, and

(13)   $\tilde g(k) = \sum_j \tilde g_y(k|j)\,\tilde h(j)$,

where $\tilde f_x(i|j)$ and $\tilde g_y(k|j)$ are smoothed conditional densities of i and k given j,
and $\tilde h(j) = \gamma \tilde h_x(j) + (1 - \gamma)\tilde h_y(j)$, where $\tilde h_x(j)$ and $\tilde h_y(j)$ are the
smoothed marginal densities on j in the (i,j) and (k,j) tables, respectively. After
the smoothed densities are obtained via (12) and (13), the uniform or Gaussian
kernel function can then be used to continuize the discrete distributions and solve
for the equipercentile equivalent in (1).

In summary, this study considers four methods of equipercentile equating, which 
involve the combinations of two types of score distributions (observed vs. smoothed), 
and two types of continuization procedures (uniform vs. Gaussian kernel methods). 
In the next two sections, the standard errors of these four equating methods will be 
derived using a technique based on the Bahadur theorem which was first introduced 
by Liou and Cheng (1995a). 

Standard Error of The Uniform Kernel Method 

Let $\xi_u = e_u(x) = G_u^{-1}[F_u(x)]$ and denote its sample estimate by $\hat\xi_u$. Liou and
Cheng (1995a) used the Bahadur Theorem (1966) to derive a general expression for
the standard error of $\hat\xi_u$ as follows:

(14)   $[\mathrm{Var}(\hat\xi_u)]^{1/2} \cong \{\mathrm{Var}[\hat F_u(x)] + \mathrm{Var}[\hat G_u(\xi_u)] - 2\,\mathrm{Cov}[\hat F_u(x), \hat G_u(\xi_u)]\}^{1/2} / g(\xi_u)$,

where $\hat F_u$ and $\hat G_u$ are defined in (7) and (8), respectively, and $g(\xi_u) = dG_u(t)/dt$
evaluated at $t = \xi_u$. The expression in (14) holds when the first derivatives of
$F_u$ and $G_u$ exist almost everywhere. We shall employ the definitions of Liou and
Cheng (1995a) that $dF_u(x)/dx = f(i) = F(i) - F(i-1)$ for $i - 0.5 < x < i + 0.5$,
and $dG_u(y)/dy = g(k) = G(k) - G(k-1)$ for $k - 0.5 < y < k + 0.5$. The
general expression (14) is simpler than the formula derived via the delta method
used by Jarjoura and Kolen (1985). However, the variance and covariance estimates
of Jarjoura and Kolen (1985) can be substituted into the right-hand side of (14) to
find $\mathrm{Var}(\hat\xi_u)$. The standard error formula for the uniform kernel method when score
distributions are smoothed using the log-linear models can also be expressed as:

(15)   $[\mathrm{Var}(\tilde\xi_u)]^{1/2} \cong \{\mathrm{Var}[\tilde F_u(x)] + \mathrm{Var}[\tilde G_u(\xi_u)] - 2\,\mathrm{Cov}[\tilde F_u(x), \tilde G_u(\xi_u)]\}^{1/2} / g(\xi_u)$,

where $\tilde F_u$ and $\tilde G_u$ denote the smoothed estimates of population parameters. Liou
and Cheng (1995a) derived the variance and covariance estimates for (15) in slightly
complicated forms. In this section, simplified variance and covariance estimates will
be derived and substituted into (14) and (15) to estimate $\mathrm{Var}(\hat\xi_u)$ and $\mathrm{Var}(\tilde\xi_u)$.
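Once the three variance and covariance terms and the density at the equated score are available, (14) and (15) reduce to a one-line computation. The sketch below uses a hypothetical function name; clipping a slightly negative bracket to zero is an assumption added to guard against rounding in small samples and is not part of the original formula.

```python
from math import sqrt

def equating_se(var_f, var_g, cov_fg, density_at_xi):
    """Sketch of Eq. (14)/(15): first-order standard error of the equated score.

    var_f, var_g   : Var[F(x)] and Var[G(xi)] estimates
    cov_fg         : Cov[F(x), G(xi)] estimate
    density_at_xi  : g(xi), the derivative of the continuized G at xi
    """
    if density_at_xi <= 0.0:
        raise ValueError("g(xi) must be positive for the standard error to exist")
    bracket = var_f + var_g - 2.0 * cov_fg
    # Clip tiny negative values caused by rounding (assumption of this sketch).
    return sqrt(max(bracket, 0.0)) / density_at_xi
```

The same function serves the Gaussian kernel formulae in (29) and (30), since they share the form of (14) with the Gaussian-continuized quantities substituted.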
Observed Score Distributions 

By substituting (5) and (6) into (7) and (8), respectively, the following
expressions are obtained:

(16)   $\hat F_u(x) = \sum_j \big[\sum_{i<i_0} \hat f_x(i|j) + \hat f_x(i_0|j)(x - i_0 + 0.5)\big]\hat h(j) = \sum_j \hat F_x(x|j)\,\hat h(j)$, and

(17)   $\hat G_u(y) = \sum_j \big[\sum_{k<k_0} \hat g_y(k|j) + \hat g_y(k_0|j)(y - k_0 + 0.5)\big]\hat h(j) = \sum_j \hat G_y(y|j)\,\hat h(j)$,

where $\hat F_x(x|j)$ and $\hat G_y(y|j)$ denote the conditional distributions of x and y given
j that have been continuized using the uniform kernel in the Form-X and Form-Y
groups, respectively. The reduced forms in (16) and (17) will simplify the estimates
of variances and covariance for $\hat F_u(x)$ and $\hat G_u(y)$. The variance of (16) can be written
as:

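(18)   $\mathrm{Var}[\hat F_u(x)] = \sum_j \sum_{j'} \mathrm{Cov}[\hat F_x(x|j)\hat h(j),\ \hat F_x(x|j')\hat h(j')]$.

Because $\hat F_x(x|j)$ and $\hat h(j)$ have zero covariance, the covariances in (18) can be
expressed as:

$\mathrm{Cov}[\hat F_x(x|j)\hat h(j),\ \hat F_x(x|j')\hat h(j')]$
  $= E[\hat F_x(x|j)\hat F_x(x|j')\hat h(j)\hat h(j')] - E[\hat F_x(x|j)\hat h(j)]\,E[\hat F_x(x|j')\hat h(j')]$
  $= E[\hat F_x(x|j)\hat F_x(x|j')]\,E[\hat h(j)\hat h(j')] - E[\hat F_x(x|j)]\,E[\hat F_x(x|j')]\,E[\hat h(j)]\,E[\hat h(j')]$,

which can be estimated by

(19)   $\mathrm{Cov}[\hat F_x(x|j)\hat h(j),\ \hat F_x(x|j')\hat h(j')]$
         $\cong \mathrm{Cov}[\hat F_x(x|j), \hat F_x(x|j')]\,\hat h(j)\hat h(j')$
         $+\ \mathrm{Cov}[\hat h(j), \hat h(j')]\,\hat F_x(x|j)\hat F_x(x|j')$
         $+\ \mathrm{Cov}[\hat F_x(x|j), \hat F_x(x|j')]\,\mathrm{Cov}[\hat h(j), \hat h(j')]$,

where $E[\hat F_x(x|j)]$ and $E[\hat h(j)]$ are estimated by their empirical estimates. When
$j \ne j'$, the covariance between $\hat F_x(x|j)$ and $\hat F_x(x|j')$ vanishes, and
$\mathrm{Cov}[\hat h(j), \hat h(j')] = -\hat h(j)\hat h(j')/(N_x + N_y)$. When $j = j'$,
$\mathrm{Var}[\hat F_x(x|j)] \cong \hat F_x(x|j)[1 - \hat F_x(x|j)]/[(N_x + 1)\hat h_x(j) - 1]$
(Jarjoura & Kolen, 1985, p. 158), and
$\mathrm{Var}[\hat h(j)] = \hat h(j)[1 - \hat h(j)]/(N_x + N_y)$. Therefore, (18) reduces to: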



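(20)   $\mathrm{Var}[\hat F_u(x)] \cong \sum_j \Big\{ \dfrac{\hat F_x(x|j)[1 - \hat F_x(x|j)]}{(N_x + 1)\hat h_x(j) - 1}\,\hat h^2(j) + \dfrac{\hat h(j)[1 - \hat h(j)]}{N_x + N_y}\,\hat F_x^2(x|j) + \dfrac{\hat F_x(x|j)[1 - \hat F_x(x|j)]\,\hat h(j)[1 - \hat h(j)]}{[(N_x + 1)\hat h_x(j) - 1](N_x + N_y)} \Big\} - \sum\sum_{j \ne j'} \dfrac{\hat h(j)\hat h(j')}{N_x + N_y}\,\hat F_x(x|j)\hat F_x(x|j')$.

Jarjoura and Kolen (1985, p. 147) [also Liou and Cheng (1995a)] used the equality
$\hat F_u(x) = \gamma \hat F_x(x) + (1 - \gamma)\sum_j \hat F_x(x|j)\hat h_y(j)$ when deriving $\mathrm{Var}[\hat F_u(x)]$. Their formula
involves the variances and covariance of $\hat F_x(x)$ and $\sum_j \hat F_x(x|j)\hat h_y(j)$ and is slightly
more complicated than (20).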

The variance of $\hat G_u(\xi_u)$ can also be obtained by replacing $\hat F_x(x|j)$ with
$\hat G_y(\xi_u|j)$ in (20). Because $\hat F_x(x|j)$ and $\hat G_y(\xi_u|j')$ have zero covariance for all j and
j', the covariance between (16) and (17) becomes:

(21)   $\mathrm{Cov}[\hat F_u(x), \hat G_u(\xi_u)] = \mathrm{Cov}\big[\sum_j \hat F_x(x|j)\hat h(j),\ \sum_{j'} \hat G_y(\xi_u|j')\hat h(j')\big]$
         $\cong \sum_j \dfrac{\hat h(j)[1 - \hat h(j)]}{N_x + N_y}\,\hat F_x(x|j)\hat G_y(\xi_u|j) - \sum\sum_{j \ne j'} \dfrac{\hat h(j)\hat h(j')}{N_x + N_y}\,\hat F_x(x|j)\hat G_y(\xi_u|j')$.

Equation (21) is also simpler than the covariance formula of Jarjoura and Kolen
(1985, p. 147). By combining (20), (21) and $\mathrm{Var}[\hat G_u(\xi_u)]$, a simpler formula than
that given by Liou and Cheng (1995a) for estimating $\mathrm{Var}(\hat\xi_u)$ is obtained. In
practice, both formulae will give similar estimates of $\mathrm{Var}(\hat\xi_u)$ if the common-item score
distributions do not deviate much from each other in the two groups. The
denominator $g(\xi_u)$ in (14) can be estimated by the sample relative frequency $\hat g(\xi_u)$
evaluated at $\xi_u = \hat\xi_u$.
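The variance in (20) and the covariance in (21) can be computed without an explicit double loop over j and j', because the off-diagonal sum equals the squared total minus the diagonal. The sketch below uses hypothetical function and argument names; it assumes the conditional cdf values at x (and at the equated score) have already been computed for every common-item score j, and that $(N_x + 1)\hat h_x(j) - 1 > 0$ for every j.

```python
import numpy as np

def var_fu(F_cond, h, h_x, n_x, n_y):
    """Sketch of Eq. (20): variance of the uniform-kernel F_u(x).

    F_cond[j] : conditional cdf F_x(x|j) at the score x of interest
    h[j]      : pooled common-item density; h_x[j]: Form-X group density
    """
    F_cond, h, h_x = (np.asarray(a, dtype=float) for a in (F_cond, h, h_x))
    n = n_x + n_y
    var_F = F_cond * (1.0 - F_cond) / ((n_x + 1.0) * h_x - 1.0)  # Var[F_x(x|j)]
    var_h = h * (1.0 - h) / n                                     # Var[h(j)]
    diag = var_F * h ** 2 + var_h * F_cond ** 2 + var_F * var_h
    hF = h * F_cond
    # Sum over j != j' of h(j) h(j') F(j) F(j') / n.
    off = (hF.sum() ** 2 - np.sum(hF ** 2)) / n
    return float(diag.sum() - off)

def cov_fu_gu(F_cond, G_cond, h, n_x, n_y):
    """Sketch of Eq. (21): covariance between F_u(x) and G_u(xi)."""
    F_cond, G_cond, h = (np.asarray(a, dtype=float) for a in (F_cond, G_cond, h))
    n = n_x + n_y
    diag = np.sum(h * (1.0 - h) * F_cond * G_cond) / n
    hF, hG = h * F_cond, h * G_cond
    off = (hF.sum() * hG.sum() - np.sum(hF * hG)) / n
    return float(diag - off)
```

Replacing `F_cond` and `h_x` by the Form-Y conditional cdf values and `h_y` (with `n_y` in place of `n_x` in the first denominator) would give the corresponding variance for $\hat G_u(\xi_u)$; that substitution is left to the caller in this sketch.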

Smoothed Score Distributions 

When score distributions are smoothed using the log-linear models, the
marginal distributions of x and y can be expressed as:

(22)   $\tilde F_u(x) = \sum_j \big[\sum_{i<i_0} \tilde f_x(i|j) + \tilde f_x(i_0|j)(x - i_0 + 0.5)\big]\tilde h(j) = \sum_j \tilde F_x(x|j)\,\tilde h(j)$, and

(23)   $\tilde G_u(y) = \sum_j \big[\sum_{k<k_0} \tilde g_y(k|j) + \tilde g_y(k_0|j)(y - k_0 + 0.5)\big]\tilde h(j) = \sum_j \tilde G_y(y|j)\,\tilde h(j)$,

where $\tilde F_x(x|j)$ and $\tilde G_y(y|j)$ are the smoothed and continuized distributions of x and
y given j in the Form-X and Form-Y groups, respectively. The random variable
$\tilde F_x(x|j)$ is a function of $\hat\beta_x$, and $\tilde h(j)$ is a function of $\hat\beta_x$ and $\hat\beta_y$. Therefore, $\tilde F_x(x|j)$
is not independent of $\tilde h(j)$. The same applies to $\tilde G_y(y|j)$ and $\tilde h(j)$. Liou
and Cheng (1995a) gave the complete expressions for $\mathrm{Var}[\tilde F_u(x)]$, $\mathrm{Var}[\tilde G_u(\xi_u)]$, and
$\mathrm{Cov}[\tilde F_u(x), \tilde G_u(\xi_u)]$, which involve complicated summations over covariance terms.
For instance, $\mathrm{Var}[\tilde F_u(x)]$ contains the estimates of $\mathrm{Var}[\tilde F_x(x)]$,
$\mathrm{Var}[\sum_j \tilde F_x(x|j)\tilde h_y(j)]$, and $\mathrm{Cov}[\tilde F_x(x), \sum_j \tilde F_x(x|j)\tilde h_y(j)]$, each of which can be
further decomposed into the sum of many covariance terms. A similar expression
applies to $\mathrm{Var}[\tilde G_u(\xi_u)]$ and $\mathrm{Cov}[\tilde F_u(x), \tilde G_u(\xi_u)]$. Therefore, the estimation of
$\mathrm{Var}(\tilde\xi_u)$ in (15) is computationally tedious in practice. An interested reader may
refer to Liou and Cheng (1995a) for the details of those formulae.

When the sample sizes are reasonably large, the sample estimates of (22)
and (23) can be closely approximated by

(24)   $F_u^*(x) = \sum_j \tilde F_x(x|j)\,\hat h(j)$, and

(25)   $G_u^*(y) = \sum_j \tilde G_y(y|j)\,\hat h(j)$,

where $\hat h(j)$ is the empirical density of score j and converges to $h(j)$ almost surely.
In practice, if the assumed log-linear model holds for the population, then $\tilde h(j)$ is
also a consistent estimate of $h(j)$ and the difference $|\hat h(j) - \tilde h(j)|$ converges to zero.
In other words, $\hat h(j)$ can be used to replace $\tilde h(j)$ in (22) and (23) to obtain good
approximations of $\tilde F_u$ and $\tilde G_u$. In smaller samples, the conditional distributions
of x given j (or y given j) cannot be estimated precisely. Therefore, we need to
smooth the distributions somehow to estimate the conditional distributions in the
FE method. In the literature, researchers have used a nonparametric smoothing method
called the rolling weighted average frequencies procedure (e.g., Jarjoura & Kolen,
1987) to estimate the conditional distributions. We may also consider (24) and (25)
the parametric counterparts of the weighted procedure to estimate $F_u(x)$ and $G_u(y)$,
respectively.

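Let $\xi_u^*$ denote the equipercentile equivalent computed using (24) and (25).
Because $\tilde F_x(x|j)$ and $\hat h(j)$ have zero covariance, the variance of $F_u^*(x)$ can be derived
as follows:

(26)   $\mathrm{Var}[F_u^*(x)] = \sum_j \sum_{j'} \mathrm{Cov}[\tilde F_x(x|j)\hat h(j),\ \tilde F_x(x|j')\hat h(j')]$
         $\cong \sum_j \sum_{j'} \{\mathrm{Cov}[\tilde F_x(x|j), \tilde F_x(x|j')]\,\hat h(j)\hat h(j') + \mathrm{Cov}[\hat h(j), \hat h(j')]\,\tilde F_x(x|j)\tilde F_x(x|j') + \mathrm{Cov}[\tilde F_x(x|j), \tilde F_x(x|j')]\,\mathrm{Cov}[\hat h(j), \hat h(j')]\}$,

where

(27)   $\mathrm{Cov}[\tilde F_x(x|j), \tilde F_x(x|j')] \cong [\partial F(x|j)/\partial\beta]\ \mathrm{Cov}(\hat\beta_x)\ [\partial F(x|j')/\partial\beta]^T \big|_{\beta = \hat\beta_x}$.

The symbol T in (27) denotes the transposition of a matrix. The derivative of $F(x|j)$
with respect to $\beta_t$ can be expressed as

$\partial F(x|j)/\partial\beta_t$
  $= \partial\big\{[\sum_{i<i_0} f(i,j) + f(i_0,j)(x - i_0 + 0.5)] / \sum_i f(i,j)\big\}/\partial\beta_t$
  $= [\sum_{i<i_0} \partial f(i,j)/\partial\beta_t + (x - i_0 + 0.5)\,\partial f(i_0,j)/\partial\beta_t] / [\sum_i f(i,j)]$
    $-\ \{[\sum_{i<i_0} f(i,j) + f(i_0,j)(x - i_0 + 0.5)] / [\sum_i f(i,j)]^2\}\,[\sum_i \partial f(i,j)/\partial\beta_t]$,

for $t = 1, \ldots, 2q + 1$, where $\partial f(i,j)/\partial\beta_t$ and $\mathrm{Cov}(\hat\beta_x)$ can be found in Holland and
Thayer (1987). The variances and covariance of $\hat h(j)$ and $\hat h(j')$ are the same as those
used in (20). By substituting (27), $\mathrm{Var}[\hat h(j)]$, and $\mathrm{Cov}[\hat h(j), \hat h(j')]$ into (26), the
formula of $\mathrm{Var}[F_u^*(x)]$ can be derived. Likewise, the variance of $G_u^*(\xi_u)$ can also be
derived by replacing $\tilde F_x(x|j)$ and $\hat\beta_x$ with the respective $\tilde G_y(\xi_u|j)$ and $\hat\beta_y$ in (26).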

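Because $\tilde F_x(x|j)$ and $\tilde G_y(\xi_u|j')$ are uncorrelated for all j and j', the covariance
between $F_u^*(x)$ and $G_u^*(\xi_u)$ can be derived as follows:

(28)   $\mathrm{Cov}[F_u^*(x), G_u^*(\xi_u)] \cong \sum_j \dfrac{\hat h(j)[1 - \hat h(j)]}{N_x + N_y}\,\tilde F_x(x|j)\tilde G_y(\xi_u|j) - \sum\sum_{j \ne j'} \dfrac{\hat h(j)\hat h(j')}{N_x + N_y}\,\tilde F_x(x|j)\tilde G_y(\xi_u|j')$.

The standard error of $\xi_u^*$ can be estimated by combining (26), (28) and $\mathrm{Var}[G_u^*(\xi_u)]$.
The computational cost of $\mathrm{Var}(\xi_u^*)$ is approximately half the cost of $\mathrm{Var}(\tilde\xi_u)$ given
by Liou and Cheng (1995a). In larger samples, $\mathrm{Var}(\xi_u^*)$ would be computationally
more efficient and closely approximate $\mathrm{Var}(\tilde\xi_u)$. In the empirical studies, an
investigation will be conducted to compare the difference between $\mathrm{Var}(\tilde\xi_u)$ and
$\mathrm{Var}(\xi_u^*)$.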



Standard Error of The Gaussian Kernel Method 

Let $\xi_c = e_c(x) = G_c^{-1}[F_c(x)]$ and denote its sample estimate and smoothed
estimate by $\hat\xi_c$ and $\tilde\xi_c$, respectively. Because both $F_c$ and $G_c$ are twice differentiable
at x and $\xi_c$, respectively, the Bahadur theorem (1966) can be applied to obtain the
following large-sample approximations to the standard errors of $\hat\xi_c$ and $\tilde\xi_c$:

(29)   $[\mathrm{Var}(\hat\xi_c)]^{1/2} \cong \{\mathrm{Var}[\hat F_c(x)] + \mathrm{Var}[\hat G_c(\xi_c)] - 2\,\mathrm{Cov}[\hat F_c(x), \hat G_c(\xi_c)]\}^{1/2} / g(\xi_c)$,

where $g(\xi_c) = dG_c(t)/dt$ evaluated at $t = \xi_c$, and

(30)   $[\mathrm{Var}(\tilde\xi_c)]^{1/2} \cong \{\mathrm{Var}[\tilde F_c(x)] + \mathrm{Var}[\tilde G_c(\xi_c)] - 2\,\mathrm{Cov}[\tilde F_c(x), \tilde G_c(\xi_c)]\}^{1/2} / g(\xi_c)$

(Liou & Cheng, 1995a). In a separate study, Holland, King and Thayer (1989, p. 10)
derived standard error formulae for $\hat\xi_c$ and $\tilde\xi_c$ via the delta method. Their formulae
have expressions similar to (29) and (30). However, the variance and covariance
estimates derived via the delta method are somewhat complicated, especially for
$\mathrm{Var}(\tilde\xi_c)$. For instance, the kernel function $\Phi(w_{ix})$ in $F_c(x)$ is a function of the sample
mean and variance [i.e., a function of $\hat\mu_x$ and $\hat\sigma_x^2$] or, equivalently, a function of
the discrete density $f(i)$ in (9). Therefore, the estimate of $\mathrm{Var}[\tilde F_c(x)]$ involves the
complicated derivatives of $\Phi(w_{ix})$ with respect to $f(i)$, which is in turn a function
of $f_x(i|j)h(j)$ evaluated at $f_x(i,j) = \tilde f_x(i,j)$ in the FE method (Holland, King
& Thayer, 1989). In this section, simpler large-sample approximations to the
standard error estimates of $\hat\xi_c$ and $\tilde\xi_c$ are obtained which bypass the computations
of these complicated derivatives.

Observed Score Distributions 

From (9), the marginal distribution of x can be expressed as:

(31)   $\hat F_c(x) = \sum_i \hat f(i)\,\Phi(\hat w_{ix}) = \sum_i \hat f(i)\,[\Phi(w_{ix}) + O_p(N^{-1/2})]$.

In (31), $\Phi(\hat w_{ix})$ is a function of $\hat w_{ix}$, which is in turn a function of $\hat\mu_x$ and $\hat\sigma_x$. It is
known that the sample mean and variance converge to their population values almost
surely. Therefore, $\Phi(\hat w_{ix})$ can be expressed as its population value plus a remainder
term. By discarding the negligible $O_p(N^{-1/2})$ term, the first-order variance of $\hat F_c(x)$
can be expressed as:

(32)   $\mathrm{Var}[\hat F_c(x)] \cong \sum_i \sum_{i'} \Phi(w_{ix})\,\Phi(w_{i'x})\,\mathrm{Cov}[\hat f(i), \hat f(i')]$.

By substituting (5) for the density $\hat f(i)$ and noting the zero covariance between
$\hat f_x(i|j)$ and $\hat h(j)$, the covariance factor in (32) can be written as:

(33)   $\mathrm{Cov}[\hat f(i), \hat f(i')] = \mathrm{Cov}\big[\sum_j \hat f_x(i|j)\hat h(j),\ \sum_{j'} \hat f_x(i'|j')\hat h(j')\big]$
         $\cong \sum_j \sum_{j'} \{\mathrm{Cov}[\hat f_x(i|j), \hat f_x(i'|j')]\,\hat h(j)\hat h(j') + \mathrm{Cov}[\hat h(j), \hat h(j')]\,\hat f_x(i|j)\hat f_x(i'|j') + \mathrm{Cov}[\hat f_x(i|j), \hat f_x(i'|j')]\,\mathrm{Cov}[\hat h(j), \hat h(j')]\}$.

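When $j \ne j'$, the covariance between $\hat f_x(i|j)$ and $\hat f_x(i'|j')$ vanishes for all i and i'.
When $j = j'$ and $i = i'$,

(34)   $\mathrm{Cov}[\hat f_x(i|j), \hat f_x(i'|j')] = \mathrm{Var}[\hat f_x(i|j)] \cong \dfrac{\hat f_x(i|j)[1 - \hat f_x(i|j)]}{(N_x + 1)\hat h_x(j) - 1}$.

When $j = j'$ and $i \ne i'$,

(35)   $\mathrm{Cov}[\hat f_x(i|j), \hat f_x(i'|j)] \cong -\dfrac{\hat f_x(i|j)\,\hat f_x(i'|j)}{(N_x + 1)\hat h_x(j) - 1}$.

The variances and covariance of $\hat h(j)$ and $\hat h(j')$ have been given in (20). By combining
(32) through (35), the formula for $\mathrm{Var}[\hat F_c(x)]$ can be derived. The variance of $\hat G_c(\xi_c)$
can also be obtained by replacing x, $w_{ix}$, $\hat f_x(i|j)$ and $\hat f(i)$ with the respective
$\xi_c$, $\psi_{k\xi_c}$, $\hat g_y(k|j)$ and $\hat g(k)$ in (32). Because $\hat f_x(i|j)$ and $\hat g_y(k|j)$ have zero
covariance for all i and k scores, the covariance between $\hat F_c(x)$ and $\hat G_c(\xi_c)$ can be
expressed as: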



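(36)   $\mathrm{Cov}[\hat F_c(x), \hat G_c(\xi_c)] \cong \sum_i \sum_k \Phi(w_{ix})\,\Phi(\psi_{k\xi_c})\,\mathrm{Cov}[\hat f(i), \hat g(k)]$
         $= \sum_i \sum_k \Phi(w_{ix})\,\Phi(\psi_{k\xi_c}) \Big\{ \sum_j \dfrac{\hat h(j)[1 - \hat h(j)]}{N_x + N_y}\,\hat f_x(i|j)\hat g_y(k|j) - \sum\sum_{j \ne j'} \dfrac{\hat h(j)\hat h(j')}{N_x + N_y}\,\hat f_x(i|j)\hat g_y(k|j') \Big\}$.

The standard error of the Gaussian kernel method can be estimated by substituting
(32), (36), and $\mathrm{Var}[\hat G_c(\xi_c)]$ into (29). The denominator $g(\xi_c)$ in (29) can be
estimated by $d\hat G_c(t)/dt$ evaluated at $t = \hat\xi_c$, which is the corresponding Gaussian kernel
density estimate. In practice, $\Phi(w_{ix})$ and $\Phi(\psi_{k\xi_c})$ in (32) and (36) can be estimated
by $\Phi(\hat w_{ix})$ and $\Phi(\hat\psi_{k\hat\xi_c})$, respectively.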
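Equations (32) through (35) amount to a quadratic form in the vector of kernel weights. The sketch below first assembles the covariance matrix of the FE density estimates and then evaluates (32). Function and argument names are hypothetical; the conditional densities are assumed to be stored as a matrix with one column per common-item score, and $(N_x + 1)\hat h_x(j) - 1$ is assumed to be positive for every j.

```python
import numpy as np

def cov_f(f_cond, h, h_x, n_x, n_y):
    """Sketch of Eqs. (33)-(35): Cov[f(i), f(i')] for the FE density estimates.

    f_cond[i, j] : conditional density f_x(i|j), each column summing to one
    h[j]         : pooled common-item density; h_x[j]: Form-X group density
    """
    f_cond, h, h_x = (np.asarray(a, dtype=float) for a in (f_cond, h, h_x))
    n = n_x + n_y
    # Multinomial covariance of h: Var on the diagonal, -h(j)h(j')/n elsewhere.
    V = (np.diag(h) - np.outer(h, h)) / n
    # Middle term of (33): Cov[h(j), h(j')] f_x(i|j) f_x(i'|j').
    cov = f_cond @ V @ f_cond.T
    d = (n_x + 1.0) * h_x - 1.0
    for j in range(f_cond.shape[1]):
        fj = f_cond[:, j]
        # Eqs. (34)-(35): covariance of the conditional densities given j.
        C_j = (np.diag(fj) - np.outer(fj, fj)) / d[j]
        # First and third terms of (33); only j = j' contributes.
        cov += C_j * (h[j] ** 2 + V[j, j])
    return cov

def var_fc(phi, cov):
    """Eq. (32): quadratic form in the kernel weights phi[i] = Phi(w_ix)."""
    phi = np.asarray(phi, dtype=float)
    return float(phi @ cov @ phi)
```

A quick sanity check on this construction is that the variance vanishes when every weight equals one, since the estimated density always sums to one in that limit.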

Smoothed Score Distributions 

When score distributions are smoothed using the log-linear models, the
estimates of $f(i)$ and $g(k)$ via the FE method have been given in (12) and (13),
which can be approximated by replacing $\tilde h(j)$ with $\hat h(j)$:

(37)   $\tilde f(i) \cong f^*(i) = \sum_j \tilde f_x(i|j)\,\hat h(j)$, and

(38)   $\tilde g(k) \cong g^*(k) = \sum_j \tilde g_y(k|j)\,\hat h(j)$,

respectively. Let $\xi_c^*$ be the equipercentile equivalent of x derived based on (37)
and (38). Then $\mathrm{Var}(\xi_c^*)$ can be obtained in a manner similar to that for obtaining
$\mathrm{Var}(\xi_u^*)$. Similar to (32), the variance of $F_c^*(x)$ can be approximated by:

(39)   $\mathrm{Var}[F_c^*(x)] \cong \sum_i \sum_{i'} \Phi(w_{ix})\,\Phi(w_{i'x})\,\mathrm{Cov}[f^*(i), f^*(i')]$.

Using the expression (37) and applying a covariance rule similar to that used in (19),
an expression parallel to (33) for the covariance term in (39) can be obtained as:

(40)   $\mathrm{Cov}[f^*(i), f^*(i')] \cong \sum_j \sum_{j'} \{\mathrm{Cov}[\tilde f_x(i|j), \tilde f_x(i'|j')]\,\hat h(j)\hat h(j') + \mathrm{Cov}[\hat h(j), \hat h(j')]\,\tilde f_x(i|j)\tilde f_x(i'|j') + \mathrm{Cov}[\tilde f_x(i|j), \tilde f_x(i'|j')]\,\mathrm{Cov}[\hat h(j), \hat h(j')]\}$,

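where

(41)   $\mathrm{Cov}[\tilde f_x(i|j), \tilde f_x(i'|j')] \cong [\partial f(i|j)/\partial\beta]\ \mathrm{Cov}(\hat\beta_x)\ [\partial f(i'|j')/\partial\beta]^T \big|_{\beta = \hat\beta_x}$,

and

$\partial f(i|j)/\partial\beta_t = [\partial f(i,j)/\partial\beta_t] / [\sum_i f(i,j)] - \{f(i,j) / [\sum_i f(i,j)]^2\}\,[\sum_i \partial f(i,j)/\partial\beta_t]$,

for $t = 1, \ldots, 2q + 1$. The variance of $\hat h(j)$ and its covariance with $\hat h(j')$ have been
given in (20). Likewise, the variance of $G_c^*(\xi_c)$ can be derived by replacing x, $w_{ix}$,
$f^*(i)$ and $\hat\beta_x$ with the respective $\xi_c$, $\psi_{k\xi_c}$, $g^*(k)$ and $\hat\beta_y$ in (39).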

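Because $\tilde f_x(i|j)$ and $\tilde g_y(k|j')$ are uncorrelated for all j, j', i and k, the covariance
can be expressed as:

(42)   $\mathrm{Cov}[F_c^*(x), G_c^*(\xi_c)] \cong \sum_i \sum_k \Phi(w_{ix})\,\Phi(\psi_{k\xi_c})\,\mathrm{Cov}[f^*(i), g^*(k)]$
         $= \sum_i \sum_k \Phi(w_{ix})\,\Phi(\psi_{k\xi_c}) \Big\{ \sum_j \dfrac{\hat h(j)[1 - \hat h(j)]}{N_x + N_y}\,\tilde f_x(i|j)\tilde g_y(k|j) - \sum\sum_{j \ne j'} \dfrac{\hat h(j)\hat h(j')}{N_x + N_y}\,\tilde f_x(i|j)\tilde g_y(k|j') \Big\}$.

Consequently, the variance of $\xi_c^*$ can be derived by combining (39), (42) and
$\mathrm{Var}[G_c^*(\xi_c)]$. The denominator $g(\xi_c)$ in (30) can likewise be estimated by
$dG_c^*(t)/dt$ evaluated at $t = \xi_c^*$. In larger samples, $\mathrm{Var}(\xi_c^*)$ is computationally more
efficient than $\mathrm{Var}(\tilde\xi_c)$; its use will be further investigated in the empirical studies.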



Empirical Studies 



Empirical Study I 

The first dataset used in the empirical study consisted of scores on two test forms
(X and Y) of an English test, each of which consisted of 55 multiple-choice items. Both
test forms were administered to 719 examinees. In addition, each examinee also
answered 25 common items. For each examinee taking the test, three scores were
computed: one for each of the two 55-item forms plus a score on the 25 common
items; these scores were simply the number of correct answers. Let i, k and j denote
scores on Form-X, Form-Y and the common items, respectively. The bivariate (i,j) and
(k,j) scores for the 719 examinees were separately smoothed using the log-linear
models in (11). The likelihood ratio statistics (Little & Rubin, 1994) suggested
that the model preserving the first four univariate sample moments and the bivariate
correlation yielded a better model-data fit to the observed data than the
other log-linear models. Therefore, the two smoothed bivariate distributions using
q = 4 in (11) were assumed to be the population data from which the sample data
were randomly selected. In this study, Form-X was equated to Form-Y; that is, a
score equivalent on Form-Y was found for each integer score on Form-X.

Form-X groups of $N_x$ = 100 and 1,000 were randomly sampled from the
smoothed (i,j) table, and Form-Y groups of $N_y$ = 100 and 1,000 were independently
sampled from the smoothed (k,j) table. The bivariate (i,j) and (k,j) distributions
were estimated using sample data. The FE method was performed using the bivariate
sample distributions. When a marginal density on j (i.e., $\hat h_x$ or $\hat h_y$) equalled zero in
the sample, a rolling weighted average of frequencies procedure described in Jarjoura
and Kolen (1987) was used to obtain nonzero estimates of $\hat f_x(i|j)$ and $\hat g_y(k|j)$ in (5)
and (6), respectively. After the marginal densities on i and k were estimated via
the FE method, the uniform and the Gaussian kernel methods were applied to solve
for the equipercentile equivalent on Y for each integer score on X. The bandwidths
in the Gaussian kernel method were set to the constant values $B_x = B_y = 1$
and 3. Holland and Thayer (1987) discussed a data-adaptive choice of bandwidth
via minimizing the sum of squared differences between continuized and empirical
distributions at all the discrete scores. Livingston (1993b) empirically showed that
the choice of bandwidths had essentially no effect on the bias of equating when the
size of the bandwidth lies below a small value (e.g., $B_x = B_y = 1.5$). We will return
to the issue of selecting an appropriate bandwidth for the Gaussian kernel method
in the next section.

The random sampling and equating procedures were replicated 100 times. In
each replication, the standard errors of the uniform kernel method were estimated by
substituting the estimates of $\mathrm{Var}[\hat F_u(x)]$, $\mathrm{Var}[\hat G_u(\xi_u)]$, and $\mathrm{Cov}[\hat F_u(x), \hat G_u(\xi_u)]$
derived in (20) and (21) into (14) for x = 0, ..., 55; the standard errors of the Gaussian
kernel method were estimated by substituting the estimates of $\mathrm{Var}[\hat F_c(x)]$,
$\mathrm{Var}[\hat G_c(\xi_c)]$, and $\mathrm{Cov}[\hat F_c(x), \hat G_c(\xi_c)]$ derived in (32) and (36) into (29). The
empirical standard error for a given x was defined as the standard deviation of its
equipercentile equivalent on Form-Y over the 100 replications. Figure 1 presents
empirical standard errors and the averages of standard error estimates over the 100
replications at different x scores for the uniform kernel method. Because the
simulated data contained few scores at the lower tail of the score distribution,
equipercentile equating at the lower tail became unstable and inaccurate. For this
reason, Figure 1 only contains plots of standard error estimates for x > 10. The
results in the Figure indicate that the simplified formula in (14) gives reasonable
estimates of the standard error for the uniform kernel method, especially for larger
samples (i.e., $N_x = N_y$ = 1,000). Figure 1 also suggests that the standard error
estimate using (14) is generally similar to that estimated by Equation (44) in Liou
and Cheng (1995a) (see Page 280, Figure 3).
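The comparison described above, between the empirical standard error and the average of the formula-based estimates, can be sketched as follows. The array layout (one row per replication, one column per Form-X score) and the function names are assumptions of this sketch. The text does not state whether the standard deviation uses a divisor of R or R - 1; the sample version (R - 1) is assumed here.

```python
import numpy as np

def empirical_se(equated):
    """Standard deviation of equated scores across replications.

    equated[r, s]: equipercentile equivalent of Form-X score s in replication r.
    Uses the R - 1 divisor (an assumption; the source does not specify).
    """
    return np.std(np.asarray(equated, dtype=float), axis=0, ddof=1)

def mean_estimated_se(se_estimates):
    """Average of the formula-based standard error estimates over replications."""
    return np.mean(np.asarray(se_estimates, dtype=float), axis=0)
```

Plotting the two returned vectors against the raw Form-X scores gives the kind of comparison shown in the figures.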

Figures 2 and 3 contain plots of standard errors for the Gaussian kernel
method with $B_x = B_y = 1$ and 3, respectively, for x > 10. It is noteworthy that
the sparsity in the tails, especially the lower tail, of the sample distributions yielded
numerical inaccuracy in the standard error estimates, particularly when the sample
size is small, for both the uniform and Gaussian kernel methods. However, the numerical
accuracy of the standard error estimates was improved by increasing the size of the
bandwidth to 3 for the Gaussian kernel method. It is also interesting to note that
an increase of the bandwidth resulted in a decrease of the standard error for the
Gaussian kernel method. Therefore, a larger bandwidth is recommended for the
Gaussian kernel method in small-sample equating where standard errors become an
overriding consideration.

The bivariate sample distributions on Forms X and Y were also smoothed
using the log-linear model that preserved the first three sample moments and the
correlation in the bivariate (i,j) and (k,j) tables. This log-linear model was recommended
for small-sample equating in several empirical studies (Liou & Cheng, 1995b; Livingston,
1993a). The smoothed densities on i and k under the common-item design were
estimated via the FE method. The uniform and Gaussian kernel methods were then
applied to find the equipercentile equivalent on Y for each of the scores on X. In the
continuization phase using the uniform kernel method, both $\tilde\xi_u$ and its simplified
version $\xi_u^*$ were solved for x = 0, ..., 55 on Form-X. The empirical standard errors
of the two score equivalents over the 100 replications are plotted in Figure 4 for the
Figure 1: Standard errors for the uniform kernel method in Study I for (a)
$N_x = N_y = 100$, and (b) $N_x = N_y = 1{,}000$. [Two panels titled "Standard Errors
of The Uniform Kernel Method"; horizontal axis: raw scores on Form-X.]

Figure 2: Standard errors for the Gaussian kernel method in Study I ($B_x = B_y = 1$)
for (a) $N_x = N_y = 100$, and (b) $N_x = N_y = 1{,}000$. [Two panels titled "Standard
Errors of The Gaussian Kernel Method"; horizontal axis: raw scores on Form-X.]

Figure 3: Standard errors for the Gaussian kernel method in Study I ($B_x = B_y = 3$)
for (a) $N_x = N_y = 100$, and (b) $N_x = N_y = 1{,}000$. [Two panels titled "Standard
Errors of The Gaussian Kernel Method"; horizontal axis: raw scores on Form-X.]
two sample sizes (Note: $\xi_u^*$ is referred to as the simplified estimate in the Figure).
In each replication, the theoretical standard error of $\xi_u^*$ was estimated by combining
(26), (28) and $\mathrm{Var}[G_u^*(\xi_u)]$. The averages of the standard error estimates over the
100 replications are also plotted in Figure 4 for the two sample sizes. The results
in Figure 4 indicate that the empirical standard errors of $\tilde\xi_u$ and $\xi_u^*$ are close to
each other except for a few x scores at the lower tail. The simplified standard error
formula gives reasonable estimates for both $\tilde\xi_u$ and $\xi_u^*$ when the sample size is large.
When the sample size is small, however, the simplified formula tends to overestimate
the actual standard errors at the lower tail. Note that $\xi_u^*$ is computed using the
empirical density $\hat h(j)$; and $\tilde\xi_u$, using the smoothed density $\tilde h(j)$. It is known that $\tilde h(j)$
converges to $h(j)$ faster than does $\hat h(j)$. Figure 4 also suggests that the empirical
standard error of $\tilde\xi_u$ is slightly larger than that of $\xi_u^*$ at the lower tail, and similar
to that of $\xi_u^*$ elsewhere. Therefore, $\xi_u^*$ seems to be a better estimate than $\tilde\xi_u$.

In the continuization phase using the Gaussian kernel method, both $\tilde\xi_c$
and $\xi_c^*$ were solved for x = 0, ..., 55 on Form-X. Figures 5 and 6 contain the plots
of the empirical standard error estimates for $\tilde\xi_c$ and $\xi_c^*$ over the 100 replications
for $B_x = B_y = 1$ and 3, respectively. Again, the standard error of $\tilde\xi_c$ is slightly
larger than that of $\xi_c^*$ at the lower tail for smaller samples. However, the empirical
standard errors of $\tilde\xi_c$ and $\xi_c^*$ do not differ significantly in larger samples. In each
replication, the theoretical standard error of $\xi_c^*$ was estimated by computing (39),
(42), and $\mathrm{Var}[G_c^*(\xi_c)]$. The averages of these theoretical estimates over the 100
replications are also plotted in Figures 5 and 6. In general, the standard error
estimates give close approximations to the empirical values for larger samples. In
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standard Errors of The Uniform Kernel Method(Smoothed) Standard Errors of The Uniform Kernel Method(Smoof 




Raw Scores on Form-X (a) 




Figure 4: Standard errors for the smoothed uniform kernel method in study I for 
(a) Nx = Ny = 100, and (b) = Ny = 1,000. 
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Figure 6, the theoretical estimates with Bx = By = 3 slightly underestimate the 
empirical standard errors for smaller samples, except at the lower extreme whose 
overestimation was indicated. 
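
The comparison reported for Study I sets the empirical standard error across replications against the average of the theoretical estimates. It can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `summarize_replications`, the use of the sample standard deviation (`ddof=1`), and the synthetic inputs are all assumptions made for the example.

```python
import numpy as np

def summarize_replications(estimates, theoretical_se):
    """Compare empirical and averaged theoretical standard errors.

    Both inputs have shape (n_replications, n_scores): one row per
    replication, one column per raw score x on Form-X.
    """
    estimates = np.asarray(estimates, dtype=float)
    theoretical_se = np.asarray(theoretical_se, dtype=float)
    # Empirical standard error: spread of the equated scores over replications
    # (ddof=1 is an assumption; the source does not state the divisor).
    empirical_se = estimates.std(axis=0, ddof=1)
    # Theoretical standard error: formula-based estimate averaged over replications.
    mean_theoretical_se = theoretical_se.mean(axis=0)
    # Ratio > 1 means the formula overestimates; ratio < 1 means it underestimates.
    ratio = mean_theoretical_se / empirical_se
    return empirical_se, mean_theoretical_se, ratio

# Synthetic stand-in for 100 replications over scores x = 0, ..., 55.
rng = np.random.default_rng(0)
n_rep, n_scores = 100, 56
true_se = np.linspace(2.0, 0.5, n_scores)  # hypothetical values
estimates = rng.normal(loc=np.arange(n_scores), scale=true_se,
                       size=(n_rep, n_scores))
theoretical = np.tile(true_se, (n_rep, 1))
empirical_se, mean_theoretical_se, ratio = summarize_replications(
    estimates, theoretical)
print(np.round(ratio[:5], 2))
```

Plotting `empirical_se` and `mean_theoretical_se` against x would give curves of the kind shown in the figures.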

Empirical Study II 

The second dataset used in the empirical study was the 1990 National
Assessment of Educational Progress (NAEP) reading data. The reading assessment
for age 17/grade 12 consisted of 112 multiple-choice items. Item responses to the
assessment items were collected from 9,229 examinees via a balanced incomplete
block spiraling design (Johnson, 1992). Assessment items were calibrated using the
three-parameter logistic model, and one item was removed from the analysis due
to a lack of fit to the model. In the empirical study, the remaining 111 items were
assembled into two test forms of 50 items each. The additional 11 items served as
common items for equating. Ability samples of sizes 100 and 1,000 were randomly
generated from a normal distribution with mean 1.051 and standard deviation 0.981,
which matched the scale of the original calibrated sample (Donoghue, 1992). The
raw scores on the test forms and common items were then simulated according to
the three-parameter logistic model using estimated item parameters and random
ability values. The sampling of random abilities and the simulation of their raw
scores on the two test forms and common items were repeated 100 times.
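
The simulation design described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure actually run: the item parameters below are randomly generated placeholders (the estimated NAEP parameters are not reproduced here), the scaling constant D = 1.7 is assumed, and drawing two independent ability samples, one per form, is one reading of the design.

```python
import numpy as np

def simulate_3pl_responses(theta, a, b, c, rng, D=1.7):
    """Simulate 0/1 responses under the three-parameter logistic model.

    P(correct) = c + (1 - c) / (1 + exp(-D * a * (theta - b)))
    D = 1.7 is an assumed scaling constant.
    """
    theta = np.asarray(theta, dtype=float)[:, None]   # (n_examinees, 1)
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    p = c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))
    return (rng.random(p.shape) < p).astype(int)

rng = np.random.default_rng(1990)

# Hypothetical item parameters for 111 items (placeholders, not NAEP values).
n_items = 111
a = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)
b = rng.normal(0.0, 1.0, size=n_items)
c = rng.uniform(0.1, 0.3, size=n_items)

# 50 items per form plus 11 common items, as in the text.
form_x, form_y, common = slice(0, 50), slice(50, 100), slice(100, 111)

def one_replication(n, rng):
    """Draw abilities from N(1.051, 0.981^2) and return raw scores."""
    theta_x = rng.normal(1.051, 0.981, size=n)   # group taking Form X
    theta_y = rng.normal(1.051, 0.981, size=n)   # group taking Form Y (assumed separate)
    resp_x = simulate_3pl_responses(theta_x, a, b, c, rng)
    resp_y = simulate_3pl_responses(theta_y, a, b, c, rng)
    x = resp_x[:, form_x].sum(axis=1)
    vx = resp_x[:, common].sum(axis=1)
    y = resp_y[:, form_y].sum(axis=1)
    vy = resp_y[:, common].sum(axis=1)
    return x, vx, y, vy

# 100 replications at the smaller sample size.
replications = [one_replication(100, rng) for _ in range(100)]
```

Each replication yields the raw scores and common-item scores that the equating and standard error procedures would then take as input.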

The equating and standard error estimation procedures performed in Study
I were all replicated using the simulated NAEP sample data. In general, the empirical
and theoretical standard errors obtained from Study II do not differ significantly
from those in Study I. Figures 7 through 9 contain the plots of empirical and
theoretical standard errors for equating smoothed distributions using different methods
for x > 20. For smaller samples, the theoretical estimates for the two methods tend
to overestimate standard errors at the lower tail; the Gaussian kernel method with
the larger bandwidth tends to underestimate standard errors at the middle and upper
ranges of the score distributions. For larger samples, however, the theoretical
estimates become reasonable.

Figure 5: Standard errors for the smoothed Gaussian kernel method in Study I
(Bx = By = 1) for (a) Nx = Ny = 100, and (b) Nx = Ny = 1,000.
[Panel title: Standard Errors of the Gaussian Kernel Method (Smoothed); horizontal axis: Raw Scores on Form-X.]

Figure 6: Standard errors for the smoothed Gaussian kernel method in Study I
(Bx = By = 3) for (a) Nx = Ny = 100, and (b) Nx = Ny = 1,000.
[Panel title: Standard Errors of the Gaussian Kernel Method (Smoothed).]



Final Remarks 

The empirical studies show that the standard error formulae given in this
research perform reasonably well when sample sizes are as large as 1,000. In small
sample equating, the standard error of the uniform kernel method is expected to
be about the same size as that of the Gaussian kernel method with Bx = By = 1,
as exemplified in Figures 1 and 2, 4 and 5, and 7 and 8, respectively.
However, the standard error of the Gaussian kernel method decreases slightly
when Bx = By = 3 is used. We found that the density estimate at $\xi_c$ in the
denominator of (29) is sensitive to the sparsity of data at the lower tails of score
distributions and often results in an extremely inaccurate estimate of $\mathrm{Var}(\hat{\xi}_c)$. An
increase of bandwidths significantly improves the numerical accuracy in those
standard error estimates. However, it is noteworthy that the standard error estimates
can be biased with a large bandwidth when score distributions have been smoothed
using the log-linear model. A similar empirical finding would be expected if the
bandwidths were selected using the data-adaptive procedure suggested by Holland
and Thayer (1989).

Figure 7: Standard errors for the smoothed uniform kernel method in Study II for
(a) Nx = Ny = 100, and (b) Nx = Ny = 1,000.

Figure 8: Standard errors for the smoothed Gaussian kernel method in Study II
(Bx = By = 1) for (a) Nx = Ny = 100, and (b) Nx = Ny = 1,000.
[Panel title: Standard Errors of the Gaussian Kernel Method (Smoothed).]

Figure 9: Standard errors for the smoothed Gaussian kernel method in Study II
(Bx = By = 3) for (a) Nx = Ny = 100, and (b) Nx = Ny = 1,000.
[Panel title: Standard Errors of the Gaussian Kernel Method (Smoothed).]

We also found that a data-adaptive bandwidth chosen by minimizing
the squared difference between the empirical and continuized distributions tended to be
unstable in smaller samples. For instance, an extremely small bandwidth could be
selected where a larger bandwidth was expected for equating very sparse
distributions (e.g., Bx = 0.007 and Nx = 100). With sparse data, it becomes a natural
choice for the Gaussian kernel method to adopt the method of variable bandwidth,
which selects a larger bandwidth for the lower-density regions of score distributions and
vice versa. However, the variable bandwidth is mathematically complicated and
requires involved calculations. In practice, a constant bandwidth chosen by minimizing the
weighted sum of squared differences, i.e., $\sum_i f(i)\,[f(i) - f_c(i)]^2$, is a useful
competitor and remains to be investigated further. Both empirical studies show that
the simplified estimates $\hat{\xi}_u^*$ and $\hat{\xi}_c^*$ contain smaller sampling error as compared with
$\hat{\xi}_u$ and $\hat{\xi}_c$, respectively, especially at the lower tails of score distributions. Therefore,
we also recommend the use of $\hat{\xi}_u^*$ and $\hat{\xi}_c^*$ in small sample equating.
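
The two constant-bandwidth criteria discussed in the final remarks, the unweighted and the weighted sum of squared differences, can be compared with a short sketch. It is a sketch under assumptions rather than the procedure used in the studies: the continuized density follows the standard mean- and variance-preserving Gaussian kernel form, $f_c(x) = \sum_i f(i)\,\phi\{[x - a i - (1-a)\mu]/(aB)\}/(aB)$ with $a^2 = \sigma^2/(\sigma^2 + B^2)$, which is assumed to match the definition given earlier in the paper. The function names, the grid search, and the binomial toy sample are all hypothetical.

```python
import numpy as np

def gaussian_kernel_density(x, scores, probs, bandwidth):
    """Continuized density f_c(x) for a discrete score distribution.

    Uses the mean- and variance-preserving Gaussian kernel form
    (assumed to match the paper's earlier definition).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.sum(scores * probs)
    var = np.sum(probs * (scores - mu) ** 2)
    a = np.sqrt(var / (var + bandwidth ** 2))
    z = (x[:, None] - a * scores - (1.0 - a) * mu) / (a * bandwidth)
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (phi * probs).sum(axis=1) / (a * bandwidth)

def unweighted_criterion(bandwidth, scores, probs):
    """Sum over i of [f(i) - f_c(i)]^2."""
    f_c = gaussian_kernel_density(scores, scores, probs, bandwidth)
    return np.sum((probs - f_c) ** 2)

def weighted_criterion(bandwidth, scores, probs):
    """Sum over i of f(i) * [f(i) - f_c(i)]^2."""
    f_c = gaussian_kernel_density(scores, scores, probs, bandwidth)
    return np.sum(probs * (probs - f_c) ** 2)

def select_bandwidth(criterion, scores, probs, grid):
    """Return the grid value minimizing the criterion (simple grid search)."""
    values = np.array([criterion(h, scores, probs) for h in grid])
    return grid[np.argmin(values)], values

# Toy sparse sample: 100 examinees, scores 0..55 (hypothetical data).
rng = np.random.default_rng(0)
sample = rng.binomial(55, 0.6, size=100)
scores = np.arange(56, dtype=float)
probs = np.bincount(sample, minlength=56) / sample.size

grid = np.linspace(0.05, 3.0, 60)
b_unweighted, _ = select_bandwidth(unweighted_criterion, scores, probs, grid)
b_weighted, _ = select_bandwidth(weighted_criterion, scores, probs, grid)
print("unweighted:", b_unweighted, "weighted:", b_weighted)
```

Rerunning the search over many simulated samples would show how stable each criterion's selected bandwidth is, which is the question the text leaves open.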